Terapixel Image Processing and Simulation with Distributed Halide

Authors

  • Tyler Denniston
  • Saman Amarasinghe
  • Leslie A. Kolodziejski
Abstract

Many image processing and simulation tasks are naturally expressed as a pipeline of small computational kernels known as stencils. Halide is a popular domain-specific language and compiler designed to implement stencil algorithms. Halide uses simple language constructs to express what to compute and a separate scheduling co-language for expressing how to perform the computation. This approach has demonstrated performance comparable to or better than hand-optimized code. Until now, Halide has been restricted to parallel shared-memory execution, limiting its performance and applicability to tomorrow’s terapixel image processing tasks. In this thesis we present an extension to Halide to support distributed-memory parallel execution of stencil pipelines. These extensions compose with the existing scheduling constructs in Halide, allowing expression of complex computation and communication strategies. Existing Halide applications can be distributed with minimal changes, allowing programmers to explore the tradeoff between recomputation and communication with little effort. Approximately 10 new lines of code are needed even for a 200-line, 99-stage application. On nine image processing benchmarks, my extensions give up to a 1.4× speedup on the same number of cores over regular multithreaded execution by mitigating the effects of non-uniform memory access. The image processing benchmarks achieve up to 18× speedup on a 16-node testing machine and up to 57× speedup on 64 nodes of the NERSC Cori supercomputer. A 3D heat finite-difference simulation benchmark achieves linear scaling from 64 to 512 Cori nodes on a $10{,}000^3$, or 1 terapixel, input. We also demonstrate scalability results for two of the image processing benchmarks on 1 terapixel inputs, and make the argument that supporting such large scale is essential for tomorrow’s image processing and simulation needs.

Thesis Supervisor: Saman Amarasinghe
Title: Professor of Electrical Engineering and Computer Science
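The communication side of the recomputation-versus-communication tradeoff mentioned in the abstract can be illustrated with a small sketch. This is not Halide code and does not use Distributed Halide's API. It is a plain-Python simulation, under assumed names (`blur3`, `split`, `distributed_blur3` are all hypothetical), of one 3-point stencil stage whose input is partitioned across several "ranks", each of which needs a one-element halo from its neighbours before it can compute its owned region.

```python
# Illustrative sketch only: simulates a distributed 1D stencil in one process.
# All function names here are hypothetical and not part of Halide.

def blur3(values):
    """3-point box blur with clamped (replicated) borders."""
    n = len(values)
    out = []
    for i in range(n):
        left = values[max(i - 1, 0)]
        right = values[min(i + 1, n - 1)]
        out.append((left + values[i] + right) / 3.0)
    return out


def split(n, ranks):
    """Return (start, stop) bounds of near-equal contiguous chunks."""
    bounds = [n * r // ranks for r in range(ranks + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def distributed_blur3(values, ranks):
    """Blur each rank's chunk after fetching a one-element halo per side.

    Returns the assembled result and the number of halo elements that
    would have been communicated between ranks.
    """
    n = len(values)
    result = []
    halo_elements = 0
    for start, stop in split(n, ranks):
        # "Halo exchange": widen the owned range by one element per side,
        # clipped at the global borders (simulated by indexing the full list).
        lo = max(start - 1, 0)
        hi = min(stop + 1, n)
        halo_elements += (start - lo) + (hi - stop)
        local = blur3(values[lo:hi])
        # Keep only the elements this rank owns; halo outputs are discarded.
        offset = start - lo
        result.extend(local[offset:offset + (stop - start)])
    return result, halo_elements


if __name__ == "__main__":
    data = [float(i * i % 7) for i in range(16)]
    out, halo = distributed_blur3(data, ranks=4)
    print(out == blur3(data), halo)
    # The 3D input size quoted in the abstract: 10,000^3 elements.
    print(10_000 ** 3 == 10 ** 12)
```

In this sketch each interior chunk boundary costs two communicated elements. With several stencil stages in a row, a rank could instead fetch a wider halo once and recompute the intermediate border values locally, which is the kind of choice the abstract says the scheduling extensions let programmers explore.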

Similar resources

Data-intensive science: The Terapixel and MODISAzure projects

We live in an era in which scientific discovery is increasingly driven by data exploration of massive datasets. Scientists today are envisioning diverse data analyses and computations that scale from the desktop to supercomputers, yet often have difficulty designing and constructing software architectures to accommodate the heterogeneous and often inconsistent data at scale. Moreover, scientifi...

Terapixel Imaging of Cosmological Simulations

The increasing size of cosmological simulations has led to the need for new visualization techniques. We focus on Smoothed Particle Hydrodynamical (SPH) simulations run with the GADGET code and describe methods for visually accessing the entire simulation at full resolution. The simulation snapshots are rastered and processed on supercomputers into images that are ready to be accessed through a...

Optimal Control of the Vehicle Path Following by Using Image Processing Approach

Nowadays, the importance of vehicles and their dramatic effects on human life is no secret. The use of trailers with multiple axles for transporting bulky and heavy equipment is essential. Increasing the number of trailer axles, which in turn increases the number of wheels, requires careful consideration in order to improve transport speed, maneuverability, safety, control, and accurate path following. Therefo...

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of ...

Publication date: 2016